This paper studies a novel audio segmentation-by-classification approach based on factor analysis. The proposed technique compensates for within-class variability by using class-dependent factor loading matrices, and obtains the scores by computing the log-likelihood ratio between the class model and a non-class model over fixed-length windows. These scores are then smoothed by different back-end systems to yield longer contiguous segments of the same class. Unlike previous solutions, our proposal does not rely on specific acoustic features and does not need a hierarchical structure. The proposed method is applied to segment and classify audio from TV shows into five acoustic classes: speech, music, speech with music, speech with noise, and others. The technique is compared to a hierarchical system with specific acoustic features and achieves a significant error reduction.
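The scoring and smoothing steps described above can be illustrated with a minimal sketch. It assumes the per-window log-likelihoods under each class model and its non-class model are already available; the factor analysis modelling itself is not shown. All function names are hypothetical, and the majority-vote filter is only a simple stand-in for the back-end systems, which the abstract does not specify.

```python
from collections import Counter


def llr_scores(class_loglik, nonclass_loglik):
    """Per-window log-likelihood ratio: log p(x | class) - log p(x | non-class)."""
    return [c - n for c, n in zip(class_loglik, nonclass_loglik)]


def classify_windows(llr_by_class):
    """For each fixed-length window, pick the class with the highest LLR score."""
    classes = list(llr_by_class)
    n_windows = len(llr_by_class[classes[0]])
    return [max(classes, key=lambda c: llr_by_class[c][t]) for t in range(n_windows)]


def smooth_labels(labels, half_width=1):
    """Majority vote over neighbouring windows (assumed stand-in back-end)."""
    smoothed = []
    for t in range(len(labels)):
        lo = max(0, t - half_width)
        hi = min(len(labels), t + half_width + 1)
        window = labels[lo:hi]
        counts = Counter(window)
        best = max(counts.values())
        candidates = [lab for lab in window if counts[lab] == best]
        # On a tie, keep the current label if it is among the winners.
        smoothed.append(labels[t] if labels[t] in candidates else candidates[0])
    return smoothed


def to_segments(labels):
    """Merge runs of equal labels into (start, end, label); end is exclusive."""
    segments = []
    start = 0
    for t in range(1, len(labels) + 1):
        if t == len(labels) or labels[t] != labels[start]:
            segments.append((start, t, labels[start]))
            start = t
    return segments
```

Here window indices stand for positions of the fixed-length windows; multiplying them by the window shift would give segment boundaries in seconds. A larger `half_width` yields longer contiguous segments at the cost of missing short events.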